IN THE CLAIMS: 



This listing of claims will replace all prior versions, and listings, of claims in 
the application: 

1. (currently amended) A method for performing a gather operation on a
general purpose computer processor comprising: 

computing addresses for a plurality of data elements of a matrix stored in 
memory, wherein each data element is identified by one of an equal plurality of
indices and a base address, and wherein computing addresses comprises
executing an equal plurality of EXTRACT instructions to transfer a plurality
of said indices from a first storage location where the indices are stored 
substantially contiguously, to an equal plurality of separate storage locations, 
wherein each index is assigned its own separate storage location; 

retrieving each of said plurality of data elements from memory based on 
the computed addresses; and 

executing an equal plurality of DEPOSIT instructions, each
deposit instruction depositing one or more of said data elements contiguously 
with other data elements in a general purpose register. 
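
By way of illustration only, the gather sequence recited in claim 1 might be sketched in C as follows. The sketch assumes a 64-bit register holding four 16-bit fields (the widths recited in claim 7) and treats each index as an element offset from the base address. The names extract16, deposit16 and gather4 are hypothetical helpers that emulate EXTRACT and DEPOSIT instructions with shifts and masks; they are not taken from the application.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an EXTRACT instruction: copies the 16-bit
 * field in slot `slot` (0 = least significant) of a 64-bit register
 * into its own separate storage location. */
static uint64_t extract16(uint64_t reg, unsigned slot)
{
    assert(slot < 4);
    return (reg >> (16u * slot)) & 0xFFFFu;
}

/* Hypothetical stand-in for a DEPOSIT instruction: writes a 16-bit
 * value into slot `slot` of a 64-bit register, leaving the other
 * slots unchanged. */
static uint64_t deposit16(uint64_t reg, unsigned slot, uint16_t value)
{
    assert(slot < 4);
    uint64_t mask = (uint64_t)0xFFFFu << (16u * slot);
    return (reg & ~mask) | ((uint64_t)value << (16u * slot));
}

/* Gather four 16-bit elements of a matrix into one 64-bit value.
 * Assumption: each packed index is an element offset from `base`. */
static uint64_t gather4(const uint16_t *base, uint64_t packed_indices)
{
    uint64_t index[4];   /* one separate storage location per index */
    uint16_t element[4]; /* one separate storage location per element */
    uint64_t result = 0;

    /* An equal plurality of EXTRACT operations: one per index. */
    for (unsigned i = 0; i < 4; i++)
        index[i] = extract16(packed_indices, i);

    /* Compute each address (base plus index) and retrieve the element. */
    for (unsigned i = 0; i < 4; i++)
        element[i] = base[index[i]];

    /* An equal plurality of DEPOSIT operations: pack contiguously. */
    for (unsigned i = 0; i < 4; i++)
        result = deposit16(result, i, element[i]);

    return result;
}
```

In this sketch the three loops correspond to the three recited steps; a processor able to issue several such instructions per clock cycle could overlap the iterations within each loop.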

2. (currently amended) The method as in claim 1 wherein said storage 
locations are general purpose registers within a general purpose processor.

3. (previously presented) The method as in claim 1 wherein computing 
addresses further comprises: 

adding each of said indices to a base address. 

4. (previously presented) The method as in claim 1 further comprising:

loading each of said data elements from memory into separate storage
locations prior to executing said second plurality of instructions.

5. (previously presented) The method as in claim 1 wherein said 
computer processor executes two or more of said first and/or second plurality of 
instructions in a single clock cycle. 

6. (original) The method as in claim 1 further comprising: 
storing each of said data elements on a mass storage device. 

7. (original) The method as in claim 2 wherein said registers are 64 bits
wide and said data elements are 16 bits in length.

8. (currently amended) A method for performing a scatter operation on a 
general purpose computer processor comprising: 

executing a first plurality of EXTRACT instructions to extract indices for 
each of a plurality of data elements, the indices being extracted into separate 
storage locations; 

using the extracted indices to calculate addresses in memory
to which said plurality of data elements are to be scattered to form a matrix in
memory wherein each address in memory is identified by one of a plurality of 
indices and a base address; 

executing a second plurality of EXTRACT instructions, each of
said EXTRACT instructions extracting one or more of said data elements
from a general purpose register in which said data elements are stored
contiguously to an equal plurality of separate storage locations; and

transferring said data elements from said separate storage locations to 
said calculated addresses in memory. 
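
By way of illustration only, the scatter sequence recited in claim 8 might be sketched in C as follows. The sketch assumes a 64-bit register holding four 16-bit fields (the widths recited in claim 13) and treats each index as an element offset from the base address. The names extract16 and scatter4 are hypothetical helpers that emulate EXTRACT instructions with shifts and masks; they are not taken from the application.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an EXTRACT instruction: copies the 16-bit
 * field in slot `slot` (0 = least significant) of a 64-bit register
 * into its own separate storage location. */
static uint64_t extract16(uint64_t reg, unsigned slot)
{
    assert(slot < 4);
    return (reg >> (16u * slot)) & 0xFFFFu;
}

/* Scatter four 16-bit elements, packed contiguously in `packed_data`,
 * to the matrix locations selected by `packed_indices`.
 * Assumption: each packed index is an element offset from `base`. */
static void scatter4(uint16_t *base, uint64_t packed_indices,
                     uint64_t packed_data)
{
    uint64_t index[4];   /* one separate storage location per index */
    uint16_t element[4]; /* one separate storage location per element */

    /* First plurality of EXTRACT operations: unpack the indices. */
    for (unsigned i = 0; i < 4; i++)
        index[i] = extract16(packed_indices, i);

    /* Second plurality of EXTRACT operations: unpack the data elements. */
    for (unsigned i = 0; i < 4; i++)
        element[i] = (uint16_t)extract16(packed_data, i);

    /* Transfer each element to its calculated address (base plus index). */
    for (unsigned i = 0; i < 4; i++)
        base[index[i]] = element[i];
}
```

The final loop plays the role of the STORE instructions mentioned in claim 11; the sketch does not check that the indices fall within the matrix, which a caller would need to guarantee.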

9. (previously presented) The method as in claim 8 wherein each of said 
storage locations is a general purpose register.

10. (currently amended) The method as in claim 8 wherein calculating 
addresses comprises:

adding each of said indices to a base address.

11. (Previously Presented) The method as in claim 8 wherein storing each
of said data elements is accomplished via a plurality of STORE instructions 
executed by said computer processor. 

12. (Previously Presented) The method as in claim 8 wherein said 
computer processor executes two or more of said instructions in a single clock 
cycle. 

13. (original) The method as in claim 9 wherein said register is 64 bits
wide and said data elements are 16 bits in length.

14. (currently amended) A computer system comprising:

a memory;

a general purpose processor communicatively coupled to the memory; and

a storage device communicatively coupled to the processor and having 
stored therein a sequence of instructions which, when executed by the 
processor, causes the processor to at least, 

compute addresses for a plurality of data elements of a matrix stored in 
memory, wherein each data element is identified by one of an equal plurality of
indices and a base address, and wherein computing addresses comprises
executing an equal plurality of EXTRACT instructions to transfer a plurality
of said indices from a first storage location where the indices are stored 
substantially contiguously, to an equal plurality of separate storage locations, 
wherein each index is assigned its own separate storage location; 

retrieve each of said plurality of data elements from memory based on the 
computed addresses; and 

execute an equal plurality of DEPOSIT instructions, each
deposit instruction depositing one or more of said data elements contiguously 
with other data elements in a general purpose register. 

15. (previously presented) The computer system as in claim 14 wherein 
said storage locations are general purpose registers. 

16. (previously presented) The computer system as in claim 14 wherein, 
responsive to one or more instructions in said sequence, said processor 
computes addresses by: 

adding each of said indices to a base address.

17. (previously presented) The computer system as in claim 14 wherein 
said processor loads each of said data elements from memory into separate 
storage locations prior to executing said second plurality of instructions. 

18. (previously presented) The computer system as in claim 17 wherein 
said processor executes two or more of said first and/or second plurality of 
instructions in a single clock cycle. 

19. (original) The computer system as in claim 14 wherein, responsive to 
one or more instructions in said sequence, said processor further: 

stores each of said data elements on said mass storage device. 

20. (original) The computer system as in claim 15 wherein said registers
are 64 bits wide and said data elements are 16 bits in length.

21. (Previously Presented) A method as in claim 1 wherein computing 
addresses comprises:

executing a series of instructions, each instruction to extract an address 
index for one of said plurality of data elements. 

22. (original) The method as in claim 21 wherein said address indices are 
extracted from a series of contiguous memory locations.

23. (Cancelled) 